fix(api): isolate side-effect session writes in multimodal and RAG handlers#38210
fix(api): isolate side-effect session writes in multimodal and RAG handlers#38210agarwalpranav0711 wants to merge 6 commits into
Conversation
…ndlers Extends the session isolation fix from langgenius#37895 to two files that were missed: base_app_runner.py and index_tool_callback_handler.py. Both files used db.session.commit() on the shared Flask request-scoped session to persist side-effect writes (a MessageFile during multimodal LLM streaming, and DatasetQuery audit logs / hit_count updates during RAG retrieval). This flushed all pending ORM changes in the request transaction, not just the intended record, risking premature commits and DetachedInstanceError under concurrency. Both now use independent sessionmaker sessions matching the established pattern from langgenius#37895. Fixes langgenius#38176
Pyrefly Diffbase → PR--- /tmp/pyrefly_base.txt 2026-07-03 07:09:49.851446131 +0000
+++ /tmp/pyrefly_pr.txt 2026-07-03 07:09:34.585416161 +0000
@@ -4261,17 +4261,17 @@
ERROR Object of class `NoneType` has no attribute `status` [missing-attribute]
--> tests/unit_tests/core/base/test_app_generator_tts_publisher.py:188:16
ERROR Argument `None` is not assignable to parameter `app_id` with type `str` in function `core.callback_handler.index_tool_callback_handler.DatasetIndexToolCallbackHandler.__init__` [bad-argument-type]
- --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:63:20
+ --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:87:20
ERROR Argument `None` is not assignable to parameter `message_id` with type `str` in function `core.callback_handler.index_tool_callback_handler.DatasetIndexToolCallbackHandler.__init__` [bad-argument-type]
- --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:64:24
+ --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:88:24
ERROR Argument `None` is not assignable to parameter `user_id` with type `str` in function `core.callback_handler.index_tool_callback_handler.DatasetIndexToolCallbackHandler.__init__` [bad-argument-type]
- --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:65:21
+ --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:89:21
ERROR Argument `None` is not assignable to parameter `invoke_from` with type `InvokeFrom` in function `core.callback_handler.index_tool_callback_handler.DatasetIndexToolCallbackHandler.__init__` [bad-argument-type]
- --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:66:25
+ --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:90:25
ERROR Argument `None` is not assignable to parameter `query` with type `str` in function `core.callback_handler.index_tool_callback_handler.DatasetIndexToolCallbackHandler.on_query` [bad-argument-type]
- --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:69:26
+ --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:93:26
ERROR Argument `None` is not assignable to parameter `dataset_id` with type `str` in function `core.callback_handler.index_tool_callback_handler.DatasetIndexToolCallbackHandler.on_query` [bad-argument-type]
- --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:69:32
+ --> tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py:93:32
ERROR Argument `list[DummyToolInvokeMessage]` is not assignable to parameter `tool_outputs` with type `Iterable[ToolInvokeMessage]` in function `core.callback_handler.workflow_tool_callback_handler.DifyWorkflowCallbackHandler.on_tool_execution` [bad-argument-type]
--> tests/unit_tests/core/callback_handler/test_workflow_tool_callback_handler.py:55:30
ERROR Argument `list[DummyToolInvokeMessage]` is not assignable to parameter `tool_outputs` with type `Iterable[ToolInvokeMessage]` in function `core.callback_handler.workflow_tool_callback_handler.DifyWorkflowCallbackHandler.on_tool_execution` [bad-argument-type]
|
Pyrefly Type Coverage
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a transactional safety issue in the API by isolating “side-effect” ORM writes (audit logging, analytics counters, and message file persistence) from the shared Flask request-scoped db.session, preventing unintended commits/flushes of unrelated pending changes during multimodal streaming and RAG callbacks.
Changes:
- Persist
MessageFileinAppRunner._handle_multimodal_image_content()using an independent SQLAlchemy session (sessionmaker(...).begin()), instead ofdb.session.commit(). - Persist
DatasetQueryaudit rows inDatasetIndexToolCallbackHandler.on_query()using an independent session, and updateDocumentSegment.hit_countinon_tool_end()using an independentSession. - Update/add unit tests to assert the new session-isolation behavior (i.e., that
db.session.commit()is not used for these side effects).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| api/core/app/apps/base_app_runner.py | Writes MessageFile via an independent session to avoid committing/closing the request-scoped session during multimodal streaming. |
| api/core/callback_handler/index_tool_callback_handler.py | Isolates audit log persistence and hit-count updates into independent sessions to avoid mutating caller transaction boundaries. |
| api/tests/unit_tests/core/app/apps/chat/test_base_app_runner_multimodal.py | Updates multimodal tests to validate isolated session usage (no db.session.commit() for side effects). |
| api/tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py | Updates callback handler tests to validate isolated session usage for audit/hit-count paths. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Hi, @agarwalpranav0711 please resolve the conflicts. |
|
On it, resolving now. Will push shortly. |
…ct-session-writes # Conflicts: # api/core/callback_handler/index_tool_callback_handler.py # api/tests/unit_tests/core/callback_handler/test_index_tool_callback_handler.py
…ct-session-writes
|
@FFXN Conflicts resolved and pushed. Also synced with the latest main (#38309) — the session-threading refactor there works cleanly alongside our independent-session fix with zero conflicts. All tests passing (1058/1058) across callback_handler, app/apps, rag/retrieval, and tools modules, plus lint and format checks clean. Ready for another look whenever you have time, thanks! |
|
@FFXN @laipz8200 @QuantumGhost Just checking in — this has been open |
Important
Fixes #<issue number>.Summary
Fixes #38176
Extends the session isolation fix from #37895 to two files that were missed:
base_app_runner.pyandindex_tool_callback_handler.py.Both files used
db.session.commit()on the shared Flask request-scoped session to persist side-effect writes (aMessageFileduring multimodal LLM streaming, andDatasetQueryaudit logs /hit_countupdates during RAG retrieval). This flushed all pending ORM changes in the request transaction, not just the intended record, risking premature commits andDetachedInstanceErrorunder concurrency.Changes:
base_app_runner.py:_handle_multimodal_image_content()now persists theMessageFilethrough an independent sessionmaker session instead of the shared session.index_tool_callback_handler.py:on_query()andon_tool_end()now persist audit/analytics writes through an independent session, following the same pattern as fix(api): isolate side-effect database writes #37895.Screenshots
N/A - backend transaction/session fix
Checklist
make lint && make type-check(backend) andcd web && pnpm exec vp staged(frontend) to appease the lint gods